Fast subset scan for spatial pattern detection
Abstract
We propose a new ‘fast subset scan’ approach for accurate and computationally efficient event detection in massive data sets. We treat event detection as a search over subsets of data records, finding the subset which maximizes some score function. We prove that many commonly used functions (e.g. Kulldorff’s spatial scan statistic and extensions) satisfy the ‘linear time subset scanning’ property, enabling exact and efficient optimization over subsets. In the spatial setting, we demonstrate that proximity-constrained subset scans substantially improve the timeliness and accuracy of event detection, detecting emerging outbreaks of disease 2 days faster than existing methods.
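As a rough illustration of the "linear time subset scanning" idea, the sketch below sorts records by a priority value and evaluates only the top-k prefixes instead of all 2^N subsets. The abstract does not give a score or priority function, so the choices here are assumptions: the score is the expectation-based Poisson log-likelihood ratio, F(S) = C log(C/B) + B - C when C > B and 0 otherwise (C and B being the summed counts and baselines in S), and the priority is the ratio count/baseline. The names `ebp_score`, `ltss_scan`, and `brute_force_scan` are hypothetical, and the input numbers are made up.

```python
import math
from itertools import combinations


def ebp_score(c, b):
    """Expectation-based Poisson log-likelihood ratio (assumed score function)."""
    if b <= 0 or c <= b:
        return 0.0
    return c * math.log(c / b) + b - c


def ltss_scan(counts, baselines):
    """Score only the top-k records by priority; return (best score, indices)."""
    # Assumed priority function: observed count divided by expected baseline.
    order = sorted(range(len(counts)),
                   key=lambda i: counts[i] / baselines[i],
                   reverse=True)
    best_score, best_k = 0.0, 0
    c_sum = b_sum = 0.0
    for k, i in enumerate(order, start=1):
        c_sum += counts[i]
        b_sum += baselines[i]
        score = ebp_score(c_sum, b_sum)
        if score > best_score:
            best_score, best_k = score, k
    return best_score, sorted(order[:best_k])


def brute_force_scan(counts, baselines):
    """Exhaustive search over all non-empty subsets, for checking small inputs."""
    best = 0.0
    n = len(counts)
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            c = sum(counts[i] for i in subset)
            b = sum(baselines[i] for i in subset)
            best = max(best, ebp_score(c, b))
    return best


# Made-up counts and baselines for five locations (baselines must be positive).
counts = [12, 5, 20, 7, 3]
baselines = [10, 6, 10, 7, 5]
score, subset = ltss_scan(counts, baselines)
print(subset, round(score, 3))
```

The prefix scan costs one sort plus N score evaluations, whereas the exhaustive check grows as 2^N; on small inputs the two can be compared directly to confirm they reach the same maximum score.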
Related papers
StarScan: A Novel Scan Statistic for Irregularly-Shaped Spatial Clusters
Introduction: Kulldorff’s spatial scan statistic [1] detects significant spatial clusters of disease by maximizing a likelihood ratio statistic over circular spatial regions. The fast localized subset scan [2] enables scalable detection of proximity-constrained subsets and increases power to detect irregularly shaped clusters. However, unconstrained subset scanning within each circular neighborhood [2], ...
Support Vector Subset Scan for Spatial Outbreak Detection
Introduction: Neill’s fast subset scan [2] detects significant spatial patterns of disease by efficiently maximizing a log-likelihood ratio statistic over subsets of locations, but may result in patterns that are not spatially compact. The penalized fast subset scan (PFSS) [3] provides a flexible framework for adding soft constraints to the fast subset scan, rewarding or penalizing inclusion of indivi...
Fast subset scan for multivariate event detection.
We present new subset scan methods for multivariate event detection in massive space-time datasets. We extend the recently proposed 'fast subset scan' framework from univariate to multivariate data, enabling computationally efficient detection of irregular space-time clusters even when the numbers of spatial locations and data streams are large. For two variants of the multivariate subset scan,...
Fast generalized subset scan for anomalous pattern detection
We propose Fast Generalized Subset Scan (FGSS), a new method for detecting anomalous patterns in general categorical data sets. We frame the pattern detection problem as a search over subsets of data records and attributes, maximizing a nonparametric scan statistic over all such subsets. We prove that the nonparametric scan statistics possess a novel property that allows for efficient optimizat...
Fast Multidimensional Subset Scan for Outbreak Detection and Characterization
Objective: We present Multidimensional Subset Scan (MD-Scan), a new method for early outbreak detection and characterization using multivariate case data from individuals in a population. MD-Scan extends previous work on multivariate event detection by identifying the characteristics of the affected subpopulation, and enables more timely and accurate detection while maintaining computational tra...